Search CORE

1,625 research outputs found

Online Importance Weight Aware Updates

Author: Karampatziakis Nikos
Langford John
Publication venue
Publication date: 01/01/2011
Field of study

An importance weight quantifies the relative importance of one example over another, coming up in applications of boosting, asymmetric classification costs, reductions, and active learning. The standard approach for dealing with importance weights in gradient descent is via multiplication of the gradient. We first demonstrate the problems of this approach when importance weights are large, and argue in favor of more sophisticated ways for dealing with them. We then develop an approach which enjoys an invariance property: that updating twice with importance weight

h

is equivalent to updating once with importance weight

2h

. For many important losses this has a closed form update which satisfies standard regret guarantees when all examples have

h=1

. We also briefly discuss two other reasonable approaches for handling large importance weights. Empirically, these approaches yield substantially superior prediction with similar computational performance while reducing the sensitivity of the algorithm to the exact setting of the learning rate. We apply these to online active learning yielding an extraordinarily fast active learning algorithm that works even in the presence of adversarial noise

arXiv.org e-Print Archive

CiteSeerX

Slow Learners are Fast

Author: Langford John
Smola Alexander
Zinkevich Martin
Publication venue
Publication date: 01/01/2009
Field of study

Online learning algorithms have impressive convergence properties when it comes to risk minimization and convex games on very large problems. However, they are inherently sequential in their design which prevents them from taking advantage of modern multi-core architectures. In this paper we prove that online learning with delayed updates converges well, thereby facilitating parallel online learning.Comment: Extended version of conference paper - NIPS 200

arXiv.org e-Print Archive

CiteSeerX

A Contextual Bandit Bake-off

Author: Agarwal Alekh
Bietti Alberto
Langford John
Publication venue
Publication date: 24/01/2020
Field of study

Contextual bandit algorithms are essential for solving many real-world interactive machine learning problems. Despite multiple recent successes on statistically and computationally efficient methods, the practical behavior of these algorithms is still poorly understood. We leverage the availability of large numbers of supervised learning datasets to empirically evaluate contextual bandit algorithms, focusing on practical methods that learn by relying on optimization oracles from supervised learning. We find that a recent method (Foster et al., 2018) using optimism under uncertainty works the best overall. A surprisingly close second is a simple greedy baseline that only explores implicitly through the diversity of contexts, followed by a variant of Online Cover (Agarwal et al., 2014) which tends to be more conservative but robust to problem specification by design. Along the way, we also evaluate various components of contextual bandit algorithm design such as loss estimators. Overall, this is a thorough study and review of contextual bandit methodology

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server